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Abstract 

A general hidden state random walk model is proposed to describe the 
movement of an animal that takes into account movement taxis with re¬ 
spect to featnres of the environment. A circular-linear process models the 
direction and distance between two consecutive localizations of the animal. 

A hidden process structure accounts for the animal’s change in movement 
behavior. The originality of the proposed approach is that several environ¬ 
mental targets can be included in the directional model. An EM algorithm 
that enables prediction of the hidden states of the process is devised to fit 
this model. An application to the analysis of the movement of caribou in 
Canada’s boreal forest is presented. 
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List of symbols 

Data 

T number of observed animal’s locations. 

Ut direction between the animal’s locations at time steps t and t + 1 

yo:T set of all observed directions yo,... ,yT 

dt distance between the animal’s locations at time steps t and t + 1 

(io:T set of all observed directions do,... ,dT 

Xit value of ith explanatory angle variable 

Zit value of ith explanatory real variable 

observed information: directions, distances and explanatory variables 
gathered form time 0 up to time t 

Hidden process 

St state (behavior) in which the animal is at time step t 

Skt indicator function equal to 1 if S'* = /c and 0 otherwise 

S'o:T set of all states Sq, ..., St 

complete information: information in and the hidden states 
So,...,St 

p{St\J^f_i) conditional probability mass of the hidden state St 
Tihk state transition probability P(5'i = k\St-i = h) 

Observed trajectory 

h{]jt, dt\St, Jy'Li)conditional joint density of the observed data {jjt, dt) 
f{yt\St,J^t_i) conditional density of the direction yt 

fk density of the direction yt given that the hidden state is k at time t 

vector of parameter of fk 
mean direction of the von Mises density fk 
concentration parameters of the von Mises density fk 
g{dt\St, conditional density of the distance dt 

gk density of the distance dt given that the hidden state is k at time t 

\ shape and scale parameters of gk 
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1 Introduction 


In animal ecology, being able to understand and model the movement of animals is 
fundamental ([!])• For example, animal behaviorists want to see to what extent an¬ 
imals have preferred movement directions or are attracted towards several environ¬ 
mental targets, such as food-rich patches and previously visited locations (spatial 
memory effect) ([2])- The development of Global Positioning System (GPS) tech¬ 
nology permits the collection of a large amount of data on animal movement. This 
can be combined to data available from geographic information systems (GIS) 
to investigate how the environment influences animal displacement. To achieve 
this goal, robust statistical techniques and flexible animal movement models are 
required. 

Discrete time models for animal movement are actively being developed and 
investigated m)- Because displacement in discrete time can be characterized 
by the distance and the direction between two consecutive localizations, circular- 
linear processes can be used to model movement in 2D. A basic model is the biased 
correlated random walk (BGRW) of [1]; it predicts the next motion angle as a 
comprise between the current one (often called directional persistence if the bias 
is towards zero) and the direction towards a specific target (also called directional 
bias). Several authors have built models to adapt or generalize the BGRW so that 
it can be applied in different contexts. |3] directly model the {x, ?/)-coordinates 
at a given time-step as a function of coordinates at the previous time-step with 
bivariate normal distributions to deal with data acquired at irregular time intervals. 
An alternative formulation of this model is proposed by [6] , but this formulation of 
the BGRW does not allow to include multiple directional biases. Despite the rapid 
increase in the development of movement analysis, most quantitative techniques 
still consider only two directional targets when estimating the mean direction of 
BGRWs. However, habitat selection studies demonstrate that animal movement 
can be influenced simultaneously by more than one or two environmental features 
m)- Our first generalization of the BGRW is therefore to make the mean direction 
of the process depend upon several directional targets. To do so, we embed the 
directional model recently proposed by [H] within the BGRW and show how it is 
easily interpretable. We also show that its estimation is numerically more stable 
than other BGRW models. 

Often, the movement trajectory of animals involves multiple movement states 
or behaviors (i). For instance, in their analysis of bison movement, [10] iden- 
tihed two states, “exploratory” and “encamped”. The former has long traveled 
distances and turning angles between two consecutive locations that tend to be 
concentrated around zero, while the latter one is characterized by short distances 
and almost uniformly distributed turning angles. Multiple movement behaviors 
can be accounted for by introducing hidden states in the models. [H] give a gen- 
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eral presentation of these models and n, 0, n and HDl are examples of the 
use of hidden state models to analyze angular-distance data in ecological appli¬ 
cations. The second main contribution of our work is to introduce more flexible 
hidden state models that can accommodate directional persistence as well as the 
simultaneous influence of several environmental targets that can vary from state 
to state. Further, by using the EM algorithm to £t the model, we are able to 
compute the posterior probabilities of the hidden state for each step of the an¬ 
imal’s trajectory. Because these probabilities take into account the targets that 
are important in each hidden state, they can be used to understand the relative 
roles of these individual targets on the overall movement and space-use patterns 
of individuals. They can also serve as input values in movement simulations, such 
as individual-based movement models (e.g., m)- Finally, these probabilities can 
highlight some regions in the landscape to be identihed as patches of interest. 

The proposed model for animal motion data, a multi-state circular-linear pro¬ 
cess, is introduced in Section 2. Each state has its own angular regression model 
featuring several environmental targets and directional persistence as introduced 
in |H]. The new model is not a Hidden Markov Model (HMM); it belongs to a wider 
class called switching Markov models investigated in na and [15]. Its parameters 
are shown to be identihable. We use the EM algorithm to maximize the likelihood, 
using a hltering smoothing algorithm (see [l5]), to carry out the expectation step; 
details are given in Section 3. Section 4 investigates the hnite sample properties of 
the estimators by simulation. Section 5 shows an application of the method to the 
analysis of the movement of caribou in the Cote-Nord region of Quebec, Canada. 
Section 6 concludes the paper with a discussion. 

2 A General Multi-State Random Walk Model 

Let us suppose that we follow an animal equipped with a GPS collar which provides 
the animal location at regular time intervals, for example every 4 hours. Additional 
geographic information about multiple habitat features is available. The data set 
consists of the time series 


{{yt,dt,Xt,Zt),t = 0,...,T}, (1) 

where yt G [0, 27r) and dt > 0 represent the direction (bearing) and the distance, 
respectively, between the animal’s location at time step t and time step t + 1 
and Xt = (xit ,..., Xpt) (resp. Zt = {zu ,..., Zpt) ) are the values of p explanatory 
angular (resp. real) variables measured that are potentially useful to predict pt or 
dt- The explanatory variables Xit,zu,i = 1, ... ,p are associated to the directions 
to and the distances from these targets with respect to the position of the animal 
at time t — 1. Explanatory variable zu can also be an indicator variable, see the 
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[D} For instance in the application of Section 5, Xt = {xitTX 2 t) are the angles of 
the directions to the closest regenerating wood cut and the direction to the closest 
“patch” visited by the animal in the past, respectively. We also denote the set of 
all observed directions and distances by (|/o:T, doir) = {(l/o dj), t = 0,..., T}. We 
often need to condition on all the information (directions, distances, explanatory 
variables) gathered from time 0 up to time t; we denote this information by the 
hltration Our goal is to develop a suitable model for this type of data, along 
with the associated inference procedures. 

2.1 A General Hidden State Model 

Animals tend to adopt different movement behaviors at different times (i)- Clearly, 
such a change in behavior implies a change in the distribution of the values of the 
observed directions and distances. Because the animal’s behavioral state over time 
is unobserved, we consider here a hidden process S'*, with t = 0,... ,T, that rep¬ 
resents the state (behavior) in which the animal is at time step t. We denote 
by {1, ..., iF} the set of possible states of St, we put S'o:t = {5'o, ..., St}- Con¬ 
ceptually, it is useful to dehne quantities that depend on both the observed and 
unobserved data. To this end, we dehne the complete data hltration as the 
hltration generated by the observed data hltration J^° and the hidden information 
up to time t. 

The joint density of the complete data is 

T 

f{yO:T,do:T,So:T) = l[p (StlJ^ll) h [Vt, dt\St, , ( 2 ) 

t=l 

where p and h represent the densities of the hidden and observed data, respectively. 
The next section proposes special cases of ([^ appropriate for animal movement 
data. 

2.2 A General Directional Random Walk Model 

In this section, we present some new angular-distance specihcations for the joint 
density of (|^. The proposal relies on the following assumptions: 

(Al). Given the hidden process Sq-,t, the observed processes j/oj and do:T are in¬ 
dependent, i.e., 

h{yt,dt\St,rt_,) = f{yt\St, rt_Mdt\St,:FU).t = I,...,T. (3) 

Moreover we suppose that the observed processes ?/o:T and do:T are Markovian 
of order 1 with respect to the hidden process St,t = 1,... ,T: 
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(A2). Given the hidden process Sq-t, we suppose that 

= 1,... ,T. (4) 

The Markovian assumption of order one is given by the distribution of yt 
only depend on the present hidden state St and not St-i,... ,So. 

(A3). Given the hidden process Sq.t, we suppose that 

g{dt\St,J^l,)=gidt\St),t = l,...,T. 


By assumption (A3), the distance dt is independent of dts, s = 1,..., f for every 
time step t = 1,... ,T. We made this assumption for computational time to be 
reasonable, otherwise we would have to model the distances as an autoregressive 
process. Let gk denote the density of dt given that the hidden process is in state 
k at time t. For the observed directions yo-.T^ according to Assumption (A2), 
f{yt\St^ J^t-i) depends on St but also on {yt-s}s<t and on environmental variables 
observed in J^t-i- Let fk{.\J^t-i) Le the density of yt given the information in J^t-i 
knowing that the hidden process is in state k at time t. 

We now propose specihc parametric forms for the functions fk and gk. For gk, 
any density function on the positive real line can be used. We use (as [TU] ) Weibull 
and gamma distributions in the data analysis section, while we use an exponential 
distribution for the simulation study because estimation of its parameter is faster. 
We denote by the parameters of the density gk. For instance in the applica¬ 
tion of Section 5, where and A^^^ denote respectively the 

shape and the scale parameters of a gamma distribution. The construction of the 
conditional circular densities is discussed next. 

Circular multivariate regression model 

Gircular regression models for BGRW ([10] Appendix G, [16]) express yt as a von 
Mises distributed with mean direction depending on yt-i and on other explanatory 
angles plus a homogenous error whose distribution depends on a hxed concentra¬ 
tion parameter k. [H] show that the log-likelihood for estimating the parameters 
of such models is often multi-modal, making parameter estimation problematic. 
Multimodal log-likelihoods also occur in multi-state models when the errors are 
assumed to be homogenous. This is illustrated in G.4 The solution to avoid these 
multimodal log-likelihoods is to adopt a consensus error model as dehned in [H]. 
Knowing that the animal is in state k, a consensus error model for yt depends on 
the vector 


-fr{k) (k) 

Vt =Ky 


cos{yt-i) \ ,^(k) ( cos{xit) 

sin(?/t_i) ) sm{xit) 

' 1 = 1 ^ 



(5) 
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where ..., are unknown parameters depending on the state k. 

(k) 

The mean direction, denoted fil , of the von Mises distribution is the direction of 

ffc) (k) 

VI . The parameters k] ^ 'M. quantify the influence of target i on the animal’s 
direction. When all = 0, i = 1,... ,p, then = yt-i and the animal tends 
to move in the direction of its previous step; the model then simplihes to the 
correlated random walk model (CRW). Conversely, if target i is highly attractive, 
then is large. Similarly, a strongly negative value of means that the 
target i has a repulsive effect and the animal tends to move away from it. We can 
remark that the vector (|^ can depend on angular explanatory variables (xu) or 
real variables {zu) as explained at the beginning of Section 2. 

In a consensus model, the concentration parameter, denoted , of the von 
Mises distribution is the length of v[^\ Thus the concentration parameter depends 
on the level of agreement between the various directional targets. If all targets 
and Ht-i point in the same direction, then the concentration is large and the 
distribution of yt is concentrated around the mean direction y). . Under these 
assumptions, we can write the density fk{,yt\^t-i) 






(U^ 


exp <! cos{yt - yf’) [>, t = 1,..., T, (6) 




where Jq is the modihed Bessel function of order 1 (im. p. 36). As shown by [8] 
and [16] , since and 4*^^ are the direction and length of the same vector § and 
@ give 




exp <1 cos{yt - yt-i) + ^ zu cos{yt - xu) ^ , 


27r/o(£ 


(U^ 


(k) 


2 = 1 


for f = 1,... ,T. This parametrization yields a numerically stable model, as the 
are now the canonical parameters of a distribution in the exponential family. 

We recover many models proposed in the literature as special cases of this von 
Mises consensus model. A special case with a single state (p (S'i= 1 in 
Q) is for example Breckling (1989), who proposed an autoregressive consensus 
von Mises distribution where xu = yt-i-i, i = 0,1,... ,p — 1. Another example 
is the biased correlated random walk model (BCWR, see for example |1|). It has 
h (^yt,dt\St,J^^_i) = f{yt\yt-i,Xit), see also [151 for additional discussion of the 
consensus model when p = 1. 

In a multistate framework, the covariate free model with and 

4^^ = is the HMM studied by [T3|. The HMM and hidden semi-Markov 
model (HSMM) considered by [lO] are also special cases of (j^ where p = 

p {St\St-i) and h [yt, dt\St, XFt-i) = f{yt\yt-i,xo, St)g{dt\St), with xq a bias towards 
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a fixed location. Models such as ([^ apply beyond animal movement. [IH] use (|^ 
to model a wind time series, where yt is the wind direction and dt is the wind speed. 
They have, p {St\T^_^) = p {St\St-uyt-i) and h {yt,dt\St,Tt_i) = f{yt\St,Tt-i)- 

2.3 Markov and Semi-Markov Processes for the Hidden 
States 

The model that we propose for the hidden process S'o:t is a homogeneous Markov 
chain. At any time step t, the animal is in one of K possible states {1,..., A"} 
and p(S'tlJy'Li) = p{St\St-i)- It is convenient to describe Sq,t as a homogeneous 
multinomial process in discrete time. Let us dehne the sequence {St, t = 0,... ,T} 
of multinomial vectors St = {Su ,..., Sxt) where we set Sjt = 1 and Sj't = 0 for 
all j' 7 ^ j when St = j, j,j' = 1,... ,K. At time t = 0, we set P(S'fco = 1) = 
{'Ko)k, such that (tto)*: > 0, k = and = 1- For the rest of 

the development we suppose that the initial distribution of the hidden process 
{{'^o)k,k = is known. We also introduce the transition probabilities 

TThk = P(*S'h = = 1), for h, fc = 1,..., iL. The contribution of the hidden 

process to the complete data density given by (|^ is a function of S'o:t and of the 
transition probabilities: 

i=l t=lh=lk=l 

Although the methodology presented in this paper works for any number of states 
K, we focus on iL = 2 in the simulation and data analysis sections. In the case 
A = 2, we use the notation introduced below: tth = 1 — qi and 1122 = 1 — 5 ' 2 - 

A semi-Markov extension. 

In some applications (for example, the study of the movement of bison, [ID]), a 
semi-Markov hidden process is more realistic. In a semi-Markov model, p{St\J^{_{) = 
p{St\St-i,Tt-i), where Tt-i is the animal’s dwell-time in the state that it occupies 
at time step t — 1, i.e., the number of consecutive time steps spent in that state. 

Following [in], any semi-Markov process can be approximated to a high degree 
of accuracy by a Markov process with an enlarged set of states. Each state of the 
approximating Markov process corresponds to a pair (S', Q) where S = 1 ,..., A 
is the state of the animal and Q = 1,... ,m is the number of time points since 
the animal has arrived in this state. In the numerical example section we consider 
two states, S = 1,2, and assume that the two dwell time distributions are shifted 
negative binomial distributions with parameters that depend on S. The transition 
probabilities are then denoted '^{g,k){h,e){ng,qg), g,h = 1,2, where nh,qh denote 
the parameters (size and probability, respectively) of the dwell time distribution 
in state h. The quantity 'n'(g,k){h,e){ng,qg) denotes the probability that the animal 


has been in the state g for k consecntive time steps and go into the state h for 
i consecntive time steps. \i = 1 for h = 1,2, then the semi-Markov process 
rednces to a Markov process. Given that one is in state {g,k), the probability of 
staying in state g is then 'n'(g^k)(g,k+i){^,(lg) = '^ggi^yQg) = ^ ~ Qg- Details of this 
approximation are given in the 

2.4 Global Model Properties 

Figure [T] summarizes the dependence structure between the hidden process, the 
explanatory variables and the observed bivariate process dehned by Q. We note 
that, in Figuregiven St and S'f_i is independent of the observed data from 
time t to T (i.e. {Jy°,_s}s>o \ This is used in the implementation of the 

hltering-smoothing algorithm presented in the 


Observed distance: 

dt-i 


dt 


t 


t 

Hidden regime: ... — V 

© 

-1 


{7ihk,h,k = 1,...,K) 



1 

Observed direction: 

yt-i 


yt 


t 

\ 

t 

Explanatory variables: 

©-2 

-TO 


Figure 1: Dependence structure of the proposed model 

The proposed model is identihable, in the sense that (up to label switching) 
different values of the parameters will result in different joint distributions for the 
observed data. This is proved using an argument similar to that presented in |18j . 
We can also demonstrate that under the assumption that the hidden process Sq.t 
is ergodic and there is only one hxed explanatory variable {{xt, Zt) = (xq, Zq), t = 
1,..., T} in the directional specihcation, then the observed data process {yo-.r, d^.T) 
is also ergodic; through this condition is sufficient, simulations suggest that it is 
presumanly not necessary. In other words, in the long run (as T gets large), the 
process converges to a stationary distribution. In the case that the angular process 
yt is only predicted by past directions yts-, s = 1,..., t, then the consistency of the 
maximum likelihood estimators of the model parameters (up to label switching) is 
a simple application of Theorem 1 of m, who show the consistency of the MLE 
for auto-regressive processes. 
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3 Inferential Procedures 


We now propose a procedure to estimate 6 = (7r,K, A), where tt is a vector of 
K X {K — 1) transition probabilities, n are the K x {p + 1) unknown parameters for 
the angular model and A are the 2K parameters (scale and shape) of the model for 
the distance variable. Given a series of observations {(?/*, dt, Xt, Zt) ,t = 0,... ,T}, 
we can write the likelihood function of these parameters as a product of the one- 
step ahead predictive densities, see for instance pni. which developed these models 
called “regime switching models” for the applications in econometrics. 


L(») = n E A'‘>)P(Stt = i\TU,e) 


( 8 ) 


t=l 


Where the “predictive” probabilities: t = are evalu¬ 

ated recursively using 1^ in the hltering-smoothing algorithm given on|^ allowing 
the evaluation of ([^. The method used by [10] to evaluate the likelihood of their 
multi-state model and to estimate its parameters can be generalized to hnd the 
parameter values that maximize (|^. However, this method sums the unobserved 
states out of the likelihood and does not allow to predict, at each time point, the 
underlying state of the animal which has interesting ecological applications. This 
is, however, possible with the EM algorithm hence our use of EM to maximize 


3.1 EM Algorithm 

The EM algorithm is generally used for the maximization of likelihood functions 
when some data are missing or unobserved. Here, yo-.r and doiT are the observed 
data and S'o:t is the missing data. The EM algorithm only requires evaluation of 
the complete data log-likelihood function, which in our case is easily derived from 

(§: 

T K K 

log Lcomplete(^7 EEE Sh,t—lSk,t log Qh) 

t=l h=l k=l 
T K 

+ EE Skt log/fc(|/t|Jy_i,K^^^) 

t=l k=l 
T K 

+ EE5‘. log gk{dt\\^'^'>). 

i=0 k=l 

The EM algorithm consists of iterating an expectation (E) and a maximization 
(M) step. Let us denote by Og the value of the estimate of 0 after the s-th iteration 
of the algorithm. Then the (s -|- l)-th iteration of the algorithm starts with one 
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application of the E-step, which evaluates the expectation of log Lcompiete with 
respect to the conditional distribution of the missing values given the observed 
data, as follows: 

Q{6\es) = JlogL complete t)!*^ T’ 

T K K 

= EEE E{Sh,t-iSk,t\J^Ty^s) \ogTihk{nh,qh) 

t=l h=l k=l 
T K 

+ EE E{Skt\J^^, Os) log fk{yt\J^li, 

t=i k=i 
T K 

+ EE E{Skt\T°T,Os)\oggk{dt\X^^^). 

t=0 k=l 

Then the value of 0^+1 is calculated in the M-step as the value of 6 that maximizes 
QieiOs). 


3.1.1 • E step 

The function Q(.|0s) involves two conditional expectations, E{Skt\J^Ty^s) and 
E{Sh,t-iSk,t\J^Ty^s)- These can be efficiently computed by a forward-backward 
(hltering-smoothing) algorithm for Markov chains, see [I5]. The filtering-smoothing 
algorithm starts from the initial time f = 0 and computes the “hltering” probabil¬ 
ities P(S't|Jy") by using predictive probabilities P(S't|Jy°_i) (going forward in time). 
The last filtering probability P(S'T'|.Pr) is then used to compute the “smoothing” 
probabilities P(S't|J^^) using Bayes theorem (going backward in time). We outline 
the details of this implementation of the E-step in the 


3.1.2 • M step 

For the M-step, we see that Q{0\0s) is a sum of three functions that depend on 
different sets of parameters and can thus be maximized separately: 


• When the latent states follow a Markov process, there is a closed form ex¬ 
pression for the maximizer of the hidden process part. 




(s+l) 

hk 


Y:t,E{Sh,t-iSk,t\j^hOs) 


(9) 


which represent the expected number of transitions from state h to state k 
divided by the expected number of transitions leaving from state h. 
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• Since fk{yt\J^t-i) ^as a consensus von Mises density, the log-likelihood for 
the directional part is concave and the maximum is easily calculated; details 
are available in [8]. 

• The maximization algorithm depends on gk{dt\X^^^). For the exponential 
distribution used in the simulation study, the maximizer has a closed form 
expression. For the Weibull or gamma distributions used in the data analysis, 
a numerical maximization (e.g., Newton-Raphson algorithm) of the weigthed 
log-likelihood is needed to compute the estimates. 

Semi-Markov specification 

If the hidden process model is not saturated (e.g., semi-Markov specihcation), 
then cannot be computed explicitly from the function Q{.\6s) and 

numerical maximization is usually needed. 


3.2 Sampling Distributions 


Quantities that are usually required for inference such as the value of the max¬ 
imized log-likelihood for the observed data or an estimation of the variance ma¬ 
trix of 0 are not directly computed when using the EM-algorithm. The hltering- 
smoothing algorithm is used to evaluate the likelihood for the observed data ([^. 
Moreover at each time f, one can evaluate the probability that the animal is in 


state k using the value of E(S'H|.Fr) in the “smooth” part of the algorithm (see 16). 


Because we are able to compute logL(0MLE), we can numerically approximate the 
negative of its Hessian matrix, whose inverse, denoted n, is the usual estimate of 
the variance matrix of the maximum likelihood estimators. A numerical approx¬ 
imation of the Hessian matrix is available under most software implementations 
of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm ([21]); in the data 
analysis section we use the one provided in the R function optim. 


3.3 Model Selection 

The number of hidden states is usually unknown and its determination is a difficult 
problem. We follow many applications ([ID],[I3|, HBj) to animal or wind movement 
and assume two states for the hidden process. In animal movement studies, a two 
state model is stable and interpretable. The selection of the potential directional 
targets can be done using the classical criteria (AIG, BIG) or Wald’s tests. 
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4 Simulation Study 


This section reports the results of a simulation study that investigates the sampling 
properties of estimators presented in Section We simulated the movement of 
one animal in the plane. The “target” was placed at the center of a map and 
the covariate Xt represents the direction to this target at time step t. Then each 
simulation scenario consisted in repeating the following steps 500 times: (i) a 
two-state Markov chain S'o:t with transition matrix tt is generated; (ii) at time 0, 
the animal is placed at a random position close to the south west corner of the 
map; (hi) at each time step t, t = 1,2 ,..the location of the animal is obtained 
by simulating a direction yt and a distance dt from the Markov switching model 
proposed in Section 2.2 with yt generated according to a consensus von Mises model 
with parameters Kq , and explanatory angles yt-i and Xt and dt is simulated 
from an exponential distribution with mean units; (iv) the simulation stops 
when the animal is within 30 distance units from the target. 

We considered two different simulation scenarios. The values of the parameters 
used in each one are given in TableScenario 1 is one where the animal shows high 
directional persistence and high attraction to the target when in state 1, and high 
directional persistence and a moderate repulsion from the target in state 2. The 
second scenario shows moderate directional persistence and moderate attraction to 
the target when in state 1, and little directional persistence and a weak attraction 
toward the target in state 2. A characteristic trajectory of one simulation under 
the hrst scenario is presented in C.5 since this scenario present attractive and 
repulsive effect of the target on the trajectory of the animal in each state. 


Table 4: Parameters for the two simulation scenarios. 

parameter scenario 1 scenario 2 


P 


-qi ) 

/ 0.9 0.1 \ 

/ 0.6 0.4 

q2 1 - <?2 / 

V 0.2 0.8 J 

V 0.1 0.9 

^(1) 

20 

5 


10 

4.5 

A(1) 

0.7 

2 

J2) 

^0 

15 

2 


-6.5 

0.4 

A(2) 

1.2 

5 


The average length of the series is 533 under the hrst scenario and 695 under the 
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second scenario, the minimum (resp. maximum) was 420 (resp. 630) steps in the 
hrst scenario, for the second scenario the minimum was 580 (resp. 870) steps. The 
model was htted to each simulation sample and the following statistical indicators 
were calculated: 


b{9) 

Sd(0) 


E 



500 


500 


-«) 


i=l 


\ 


500 


499 


- sy 


i=l 


500 


500 




i=l 


( 10 ) 

( 11 ) 

( 12 ) 


with 6'*'*^ the parameter estimate in the Tth simulation and 6 the mean of the 
estimates over the 500 simulations. Equation (10) gives the bias of the estimator, 
01 its standard deviation and ( [I^ is the mean of the standard error estimator. 
We also computed the coverage of the nominal 95% Wald conhdence interval as 
the proportion of times that the true parameter 6 belonged to 0*^*^ ± 1.96n(0^*^). 
The results are summarized in Table IH 
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Table 5: Result of the N = 500 simulations on the two simulation scenarios. 




e 

b{e) 

Sd(0) E 

\/v{0) 

95%interval 

coverage 


Qi 

0.1 

0.002 

0.020 

0.021 

0.950 


(12 

0.2 

-0.003 

0.039 

0.039 

0.954 



20 

0.259 

1.675 

1.718 

0.958 

scenario 1 


10 

0.125 

1.089 

1.051 

0.960 

{n = 533) 

Ad) 

0.7 

-0.002 

0.038 

0.037 

0.950 


fi-d) 

15 

0.322 

2.065 

1.944 

0.950 


^(2) 

-6.5 

-0.098 

1.158 

1.113 

0.940 


A(2) 

1.2 

0.014 

0.110 

0.111 

0.952 


Qi 

0.4 

-0.008 

0.060 

0.058 

0.950 


(12 

0.1 

-0.002 

0.020 

0.022 

0.958 


fi-d) 

5 

0.137 

0.864 

0.804 

0.950 

scenario 2 


4.5 

0.136 

0.888 

0.853 

0.940 

{n = 695) 

Ad) 

2 

-0.003 

0.162 

0.167 

0.956 


fi-d) 

2 

0.003 

0.071 

0.075 

0.966 


^(2) 

0.4 

-0.007 

0.068 

0.072 

0.958 


A(2) 

5 

0.010 

0.185 

0.186 

0.952 


Table l^shows that the inferential procedure presented has good statistical prop¬ 
erties. First the maximum likelihood estimator appears to be unbiased {b{6) ~ 0) 
for the coefficient of distance and for the transition probabilities. Estimates of 
the K parameters exhibit a weak bias, which is higher in the hrst scenario, a phe¬ 
nomenon well known for the von Mises distribution (ini)- The quasi equality 


between Sd(0) and E 


indicates that our variance matrix approximation 


of the variance of the parameter estimates is correct. Finally, the 95% Wald con- 
hdence intervals based on v have an empirical coverage rate that is approximately 
equal to 95%. 


5 Application to the Analysis of the Movement 
of Caribou 

We now apply the model using data collected by [2] about the movement of forest 
caribou in the Cote-Nord region of Quebec, Canada. This involved 23 animals 
with different home ranges. The proposed two state model htted well the data of 
several animals. We observed strong variation among animals so in this section we 
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focus on a single animal, wearing a collar recording its locations every four hours, 
observed in the 2006-7 winter period. Its position was observed at T = 617 time 
points and Figure [^shows its trajectory, with distance expressed in meters, during 
the observation period. In Figure]^ the distances traveled in four hour intervals 
are generally small and the caribou is mostly observed around two different sites. 
Because the individual travels between sites, two different states, encamped and 
traveling, might be envisaged. The animal movement mostly takes place in the 
NE-SW direction. 



Figure 2: Caribou trajectory from December 28 2005 to April 15 2006. The 
red circle corresponds to the start and the blue one to the end of the observed 
trajectory. The yellow color corresponds to the locations with regenerating cuts. 

Besides directional persistence, yt-i-, several explanatory angles were considered 
in the analysis. The ones kept for the hnal model were Xcut, the direction to the 
closest regenerating cut (i.e., a forest stand that has been cut between 5 and 20 
years ago, the yellow in Figure [^, and the direction to the centroid of a cluster 
of recently visited locations Xcenter- At time t the locations visited by the animal 
between times 0 and t — 1 are put in clusters. Cluster 1 is the set of locations 
visited between time 0 and ti. The cluster ends at time ti means that the distance 
between the position at time ti -|- 1 and the centroid of the cluster is, for the 
hrst time, larger than a fixed number D = 1.6 km (see the Appendix of [2] for 
more details). In the same way the second cluster is made of the locations visited 
between time ti -|- 1 and t 2 and so on. At time t, the average locations for all 
clusters are calculated and the cluster whose average location is closest to the 
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current animal position is used to compute ^center as the direction to the average 
location of the closest cluster. The exploratory analysis in the reveals that the 
relative appeal for all these targets varies with the distance traveled. This suggests 
fitting a two state model to the data. 

5.1 Analysis with a Two-State Model 

We now fit the proposed model with K = 2 states featuring directional persistence 
and the two explanatory angles, a;cut and Xcenter, presented above. The distance 
traveled was htted using the Weibull and gamma distributions, both AIC and 
BIC criteria select the gamma distribution (AlCweibuii = 1620.60 and BICweibuii = 
1673.70). Two models, with a Markov and a semi-Markov specihcation for the 
hidden process, are considered. The results are summarized in Table 

Table 6: Estimation of the parameters of multistate models, with gamma dis¬ 
tributed distances; and denote the shape and scale parameters of the 
gamma distribution in the state k; Uk and qk are the size and probability of the 
negative binomial distribution of the state fc’s dwell time. 




Markov 

Semi-Markov 


Estimate 

S.e. 

Estimate 

S.e. 

qi 

0.2799 

0.0891 

0.1290 

0.1087 

ni 

1 


0.3046 

0.2529 

(l2 

0.0231 

0.0097 

0.0157 

0.0124 

n2 

1 


0.6034 

0.3828 

'^persist. 

^(1) 

'^center 

1.2666 

0.3732 

0.3327 

0.2994 

1.2617 

0.4251 

0.3368 

0.2995 


0.1601 

0.3013 

0.1274 

0.3043 

aP 

0.6477 

0.1372 

0.6670 

0.1366 

a(^) 

3.0444 

0.8450 

2.9361 

0.8122 

'^persist. 

'^center 

0.0274 

0.2590 

0.0618 

0.0694 

0.0263 

0.2563 

0.0619 

0.0695 

^(2) 

^cut 

0.1454 

0.0679 

0.1462 

0.0698 

Af> 

1.2626 

0.0774 

1.2704 

0.0779 

aP 

0.1351 

0.0119 

0.1332 

0.0119 

(/,AIC,BIC) 

(-794.91,1613.82,1666.91) 

(-793.11,1614.22,1676.17) 


In Table 1^ state 1 is a traveling mode with a large estimate for the average dis- 
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^ [K) ^( k ; 

tance traveled A 2 , while state 2 is encamped. Most of the data points are in 
state 2 and the information for the estimation of the state 1 parameters is limited. 
The hnal model for data interpretation, selected as the one with the smallest BIG, 
is the one with a Markovian hidden process. Table [^shows two different regimes for 
the direction and the traveled distance between steps. In the hrst state, the cari¬ 
bou has a strong signihcant directional persistence (kp^^sist. > 1-26, s.e. = 0.33). 
The two K parameters for the environmental targets are positive; however, because 
of the limited data available in state 1, their standard errors are large and these 
parameters are not signihcantly different from 0. In state 1, the animal moves at 

an average speed of a|^ ^A^ ^4 ~ 493 m per hour. 

In the second regime the animal is almost stationary, moving by about 43 m 

(a[ ^A 2 V 4 ) per hour. The directional persistence parameter ^persist, 
nihcantly different from zero. The parameters for the two environmental targets 
are signihcantly larger than 0, suggesting that the caribou is attracted by loca¬ 
tions previously visited and by the closest regenerating cut. The interpretation of 
the signihcant parameter for wood regenerating cuts is challenging as the animal 
never reaches them (Figure]^. It could be an accidental relationship: when in 
the encamped state in the north-east corner of the map, the caribou moves in 
the north-south direction and the wood regenerating cuts happen to be located 
south of this area. Considering |22], the wood regenerating cuts could also act 
as a proxy for the center of the animal’s current home range, the area where the 
caribou relocated after the cuts. 

The stationary distribution of the latent htted Markov chain is in state 1 with 
probability g 2/(<?2 + <?i) = 0.076, showing that the caribou was observed traveling 
about 47 out of T = 617 sightings. This explains why the parameter estima¬ 
tors have a low precision in state 1. The state of the localization at time t can 
be identihed using the smooth probabilities P(S'tfc = 1| t = 1,...,T, 

k = 1,2 calculated in the hltering-smoothing part of the E-step of the EM al¬ 
gorithm. This is depicted in Figure with a color gradient from red (state 1, 
P(S'ti = 1 |Jt;^mle) = 1) to blue (state 2, F{St 2 = 1|^t;^mle) = !)• 
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Figure 3: Hidden state probabilities at each time step of the trajectory of the 
caribou. 


j2] analyzed these data with single state models. They found directional per¬ 
sistence to be an important determinant of caribou movement. Our analysis com¬ 
plements theirs as we show that this variable is important only when the caribou 
travels between sites (i.e., when the animal is in state 1). For more than 90% of 
the observation times, the caribou was in state 2 (encamped mode) and its move¬ 
ment was not determined by directional persistence. The angular analysis clearly 
identihed two environmental features, regenerating cuts and previous locations 
visited, that influenced (attraction effect) caribou movement. Plots produced by 
our method, such as the one shown in Figure]^ can thus be used to identify areas 
of inter-patch movements (state 1, red) or of residency (state 2, blue) which tend 
to be studied with two different models in ecological research (e.g., |23], [21] and 

H). 


Section 4 shows that the proposed model describes well the motion of animal 
going towards a target. The data analyzed in this section, see Figure 2, is about 
an animal going back and forth between two targets. Is model ([^ suitable for such 


data? C.6 shows that it is, through simulations. If an animal is attracted by two 
targets and if the model at step t shows a directional bias only towards the closest 
one, then (1^ describes the motion of an animal moving back and forth between 


the two targets. See C.6 for details. 


6 Conclusion 

This paper proposes a multi-state model with a general directional specihcation 
to describe the movement of an animal. It improves on classical BCRW because it 
allows the animal to exhibit several movement behaviors. It is a general method 
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to determine different behaviors, as it can handle and reveal the response of an 
animal to an arbitrary number of environmental features. The method generalizes 
the contribution of ([lO]) whose statistical model, based on HMMs, only permits 
the inclusion of one environmental characteristic in the analysis. 

In ecology, the understanding of the interplay between animal movement and 
habitat heterogeneity, including the characterization of strategies that animals use 
to locate sites for forage and safety, has been a long standing problem ([2S])- By 
using recent techniques for the implementation of the EM algorithm in complex 
settings, we provide new statistical tools to identify these hidden animal behav¬ 
iors. The appeal of these new methods is that they can handle an arbitrary number 
of explanatory variables associated to the animals’ environment. These explana¬ 
tory variables can be directions to environmental features, or continuous variables 
accounting for a time effect of some habitat characteristics. See [8] for a more 
general discussion of single-state angular regression models featuring both angular 
and continuous explanatory variables. 

The directional component of the proposed angular-linear random walk pro¬ 
cess has several important properties. It is based on a sound statistical method to 
combine an arbitrary number of explanatory variables. From a numerical point of 
view, the M-component of the EM algorithm is simple as it involves the maximiza¬ 
tion of several convex functions, at least in the Markovian case. This is so because 
of the consensus error specihcation for the angular part of the model. A regression 
parameter k is easily interpreted as the relative appeal of a particular target when 
compared to the others. The k estimates allow an accurate characterization of 
the different behavioral modes of an animal, as illustrated in the analysis of the 
caribou data. 

[27] point out that because of the heterogeneity of the landscape, animals have 
to move through various types of areas that are more or less suitable for their 
current needs. They propose a technique to identify intensively used areas based 
on distance traveled. The smooth probabilities obtained through the EM algorithm 
are an alternative to identify these intensively used places that takes into account 
both the trajectory and the landscape heterogeneity. 

Technological advances in satellite telemetry, such as Argos archival data log¬ 
gers, have allowed researchers to track animal movements and behavior in envi¬ 
ronments that are difficult to study like marine systems (123.123). Measurement 
errors in the locations acquired with this technology should be taken into account 
([5], [29]). In our application, the land animal is followed using the more accurate 
GPS technology. In this case, the errors in the locations are small in comparison to 
the image resolution, and this translates into errors in the directions and distances 
between locations and targets that are negligible. 

There are several possibilities for further extension of the method presented 


20 


here. Because caribou exhibit heterogeneity in their movement behavior, the si¬ 
multaneous analysis of the movement of many individuals would require a model 
with random effects. Including these in the proposed multi-state model should be 
relatively straightforward, but adapting the numerical procedure appears difficult. 

When a semi-Markov structure is assumed for this hidden process, the Markov 
approximation considered here involves a large number of states and is computa¬ 
tionally demanding. Or when trying to model the behavior of an animal over a 
long period of time (e.g., more than one “biological season”, see EDI). the time 
homogeneity assumption can be unreasonable. Hence defining a directional model 
based on a more complex hidden process could be an extension. Though the nu¬ 
merical algorithm proposed here works really well when the hidden process is a 
time-homogeneous Markov chain, a new numerical approach would presumably be 
required if a different hidden process were assumed. 
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A Filtering-Smoothing Algorithm for the Markov 
Specification 

In the E-step of the (s-M)-th iteration of the EM algorithm we have to compute 
two posterior expectations involving the hidden Skt, k = 1,..., K, t = 0,... ,T, 
conditionally on the observed data 

E{Skt\J^^,6s) = F{Skt = l\J^^,e,) (13) 

E{SH,t-iSk,t\J^^,es) = nSh,t-i = i\Skt = CJ^T,0s)nSkt = i\J^T.0s)A^4:) 


where Og is the maximized vector of parameters after the s-th step of the EM 


algorithm. The first probability on the RHS of (14) can be computed with Bayes’ 


theorem because, as we can see from Figure]^, St-i is independent of the observed 
data from time t to T (i.e. \ given St and Dt_i ■ 




p(Si,,-i = i|SH = iW?,e.) = 


k = 1,... ,K, t = 0,...,T. 


Finally, to compute the remaining conditional probabilities in the posterior expec¬ 


tations (13) and (14), we adapt the classical filtering-smoothing algorithm of 
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Filtering-smoothing algorithm to implement the E-step 

OF THE (s 1)-TH ITERATION OF THE EM ALGORITHM. 


Filter Compute F^Su = Og), for every / = 1,..., : 


Etifkiyt\J^?-^0s)9kidueg)F{Skt = i\J^ueg 

where F{Si^i = l\T°,6s) = Ylk=i 
and 


K 


= i\j^uos) = J2nfnSk,t-i = 

fc=i 

for t = 2,... ,T. 


(15) 


Smooth Compute = ll-Fr, ^s), for every I = 1,... ,K: 


S-Step 1 For t = T, set f‘{SiT = II-Ft) ^s), the conditional probability computed 
at the last hltering step. 

S-Step 2 Recursion: For t = T — 1,..., 0, compute: 


F{Su = l\J^^,0g) = 


k=l 


Ik 


'¥{Sit = l\J^^,eg)¥{Sk,t+i 


— \ Og) 






(16) 


B Markov Approximation of a Semi-Markov Pro¬ 
cess 


We follow the idea of approximating a semi-Markov process by an extended Markov 
chain as introduced by [I9]. We present here the details for a two state process. 

Let us suppose that Sq-,t is a two state semi-Markov process with Qi and Q 2 the 
probability mass functions of the dwell times in both states. Then using notation 
of Section 2.3, rih, denote the parameters (size and probability) of the dwell time 


distribution in state h, Q^. The stochastic behavior of this process is approximated 
by a Markov chain Sq-,t with state space {(i, fc), i = 1, 2, fc = 1,..., mj}, where i 
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still denotes the state of the animal and k is the dwell time in this state. Given that 
St = (i, k), two moves are possible: one is to go to state (3 — 1) with probability 

'^{i,k)(3—i,l) Qi) Q ii^k^ / {Q ii^k^ Q i(^k ^, 

or to state k + 1)) with probability 1 — vr(j^fc)( 3 _j^i)(nj, g*). In applications 

the Qi are often chosen to be the probability mass functions of negative binomial 
distributions and in this case, typical values of the mds are around 30. 

Figure B.4 is an example of the transition matrix contructed as exposed with 
two states and mi = 4, m 2 = 4, it means that the enlarged Markov chain Sq^t is 
an 7 state Markov chain. We also give the basic example of mi = m 2 = 1, which 
reduces to the classical Markov chain model; 

( 1 , 1 ) ( 2 , 1 ) ( 1 , 1 ) ( 2 , 1 ) 

(1,1) A-7r(i,i)(2,i)(l,gi) 7r(i,i)(2,i)(l,gi) ^ (1,1) /1 - gi qi \ 

(2, 1) \ 7J'{2,1)(1,1)(1, 5'2) 1 — 7’'(2,1)(2,2)(1, Q' 2 ) / (2, 1) \ ^2 1 ~ 5'2 / 
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(1,1) (1,2) (1,3) (1,4) (2,1) (2,2) (2,3) 

0 1 - 7r(i,i)(2,i)(^^i,gi) 0 0 7r(i,i)(2,i)(ni,gi) 0 0 
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Numerical Details 


C.l Finding the Global Maximum of the Likelihood Func¬ 
tion 

Due to the complexity of the model, the EM algorithm may converge to local 
or spurious maxima of the likelihood function. We have observed that the EM 
algorithm can quickly converge to spurious maxima (in less than 10 iterations). 
This is due to the fact that for some parameters in k, X and tt, the likelihood is 
unbounded, a common phenomenon in the case of mixture models |31j . 

To deal with this, we run the EM algorithm with many random starting values 
for a few iterations and check for spurious and local solutions. We then choose the 
parameter values that yield the highest likelihood as the starting point of a new 
EM algorithm that we run until convergence. This strategy of combining short- 
and long-run EM algorithms to avoid possible local and spurious maxima is known 
as the lem-EM algorithm ([S2])- To obtain an estimate of the variance matrix of 
the maximum likelihood estimator we run one iteration of the quasi-Newton al¬ 
gorithm to obtain the value of the inverse of the Hessian matrix of the observed 
log-likelihood function evaluated at the maximum likelihood estimates. Here is an 
algorithmic description of this procedure. 


Finding the global maximum of the observed log-likelihood fung- 

TION 


Preliminary step : 

• Let 6i,... ,6 n be N random initial starting values. (In our application of 
this method, we chose N = 50.) 

• For i = 1,... ,N, run the EM algorithm until the hrst of (i) 50 iterations 
or (ii) the greatest relative difference in parameter value between successive 
iterations is less than 1%. Denote the estimators obtained at the end of this 
step Oi, i = 1,..., N. 

Avoid spurious maxima : 

• For each 6i, i = 1,..., N, compute the stationary distribution of the Markov 
chain, k = 1,..., K. 

• Only keep the {Oi,i El} such that 

min > e and max < M. 

k=l,...,K j=l,...,p;k=l,...,K 

(In our application of this method, we chose e = 0.001 and M = 100.) 
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Avoid local maxima : 


• Put 6q = argmaxjg/L(0j). 

Long-run EM algorithm ; 

• Start the EM algorithm at 6q and run it until the first of (i) 10 000 iterations 
or (ii) the greatest relative difference in parameter value between successive 
iterations is less than 10“®. 

Quasi-Newton iteration ; 

• Run one iteration of the quasi-Newton algorithm with the output of the 
long-run EM algorithm as initial value to get the hnal global maximum 
likelihood estimators of the model parameters and an estimation of their 
variance matrix. 


C.2 A Note on the Initial Distribution (7ro)/c, /c = 1,..., A 

Calculation of either the observed or complete data likelihood involves the initial 
distribution of the Markov chain, (tto)*:, k = We have decided to £t 

the model twice. For the first fit we use (7ro)fc = 1/iP, /c = 1,..., iP, the uniform 
distribution over the K states. Then we fit the model again, but this time with 
(7ro)A: = kk, the stationary distribution of the chain computed from the first model 
fit. 

C.3 A Note on the Identifiability of the Model up to State 
Label Switching 

We can easily see that the value of the likelihood function remains the same if 
we relabel the states. We therefore define the states at the end of each M-step as 
follows: we give label i to the state with the f-th smallest kg . 

C.4 Multi modality of the log-likelihood function with ho¬ 
mogeneous errors 

In this section we want to illustrate the multi modality of the log-likelihood of the 
biased correlated random walk model with homogeneous errors in a very simple 
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example. Let us suppose that the Markov chain has K = 2 states and dehne the 
vector 


vi’"'’ = 1 X 


cos{yt-i) \ ^ .(fc) / cos(a;i) 
sm{yt-i) J I sin(a;t) 




(17) 


where Xt is uniformly distributed on the unit circle, = —0.2, = 0.7 and 

let us denote /i|^^ and 4^^ the direction and the length of the vector (0. The 


,{k) 


and 


direction yt is simulated as a von Mises distribution with mean direction p) 
concentration with = 0.1, = 0.3 . 

The distances are simulated using different gamma distributions in each state 
= 0.5, = 2 and A^^ = 1, A^^^ = 0.2) and the Markov chain’s parameters 

are gi = 0.1 and q 2 = 0.2. All the parameters are assumed to be known except for 
/3i and (32- Figure gives contour plots of the log-likelihood for (/di,/32) obtained 
with a simulated sample of size n = 50. It shows two local maxima, around (/3i, /32) 
equal to (—1.5,0) and (—1, .5). Such a multimodal log-likelihood was obtained 4 
times out of 20 simulations. 



betal 


Figure 5: The log-likelihood function of the parameters {I3^^\ (3^“^'^) with sample 
size n = 50. 


For standard angular regression problems, the concensus model has a concave 
log-likelihood, see Section 2^, that is easily maximized. We expect that these good 
numerical properties will also apply to the general random walk directional model 
introduced in Section |2.2[ Actually, such a unimodal log-likelihood was obtained 


20 times out of 20 simulations. 
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C.5 Characteristic Trajectory Under First Scenario of Sim¬ 
ulation 



Figure 6: Characteristic trajectory of one simulation under scenario 1 in Table 
The first state {k = 1) is coloured in red, while the second state {k = 2) is blue. 
The black cross is the starting point and the green cross is the target. 

C.6 Validation 

In order to compare the observed trajectory to trajectories simulated from the 
fitted model, we first notice that the animal has two symetrical sub-trajectories. 
In the first one, the animal is going from the departure (red circle in Figure 
to the bottom left corner of Figure In the second one, the animal returns to 
a hnal point (blue circle in Figure [^) that is close to the starting point of the 
first sub-trajectory. This symetry means that we can save computing time by 
only comparing trajectories simulated by the model to one of the two observed 
sub-trajectories; we chose to compare them to the first observed sub-trajectory. 

To make the computational load manageable, we simulate trajectories using a 
slightly modified model: 

• we drop the variable Xcut since it would take too much time to recompute at 
each simulated time step and its effect in the model is weak; 

• we simplify the variable Xcenter, which would also take time to recompute, 
and redefine it as the direction to the centroid of the locations in the bottom 
left corner of Figure]^ 

Except for these simplifications the trajectories are simulated using the esti¬ 
mated parameters in Table We simulate N = 500 trajectories to produce the 
following figures and statistics. Figure and Figure depict, respectively, the 
empirical cumulative distribution function and histogram of the number of time 
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steps required to return to circle of diameter 2 km around of the centroid of the 
locations in the bottom left corner. We stop the trajectory after 10,000 steps if it 
does not reach the neighborhood (this happened only in 9 of the 500 simulations, 
so less than 2% of the times. 



0 2500 50*X 7500 100X 

time steps 

Figure 7: Histogram of the number of 
time steps needed for the simulated 
trajectories to reach a neighborhood 
of the centroid of the locations in the 
bottom left corner of the map. 



time steps 

Figure 8: Cumulative of the number 
of time steps needed for the simu¬ 
lated trajectories to reach a neighbor¬ 
hood of the centroid of the locations 
in the bottom left corner of the map. 


Table gives estimates of the probabilities that the animal arrives in a neigh¬ 
borhood of the centroid of the locations in the bottom left corner of the map based 
on the 500 simulations. 


Table 7: Quantiles of the return’s time steps distribution. 


time steps 

500 

1000 

2000 

5000 

10000 

probability 

0.098 

0.25 

0.472 

0.826 

0.982 


Finally, we show some randomly chosen trajectories from the simulation model. 
The color corresponds to the same state presented in the analysis of the caribou 
data in Section [^i.e., red for the exploratory state and blue for the encamped one. 
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D Descriptive Statistics on the Trajectory of the 
Caribou 

We present an exploratory analysis of the trajectory data of the caribon. These 
data consist of 617 observations of the form (?/*, dt, Xt), where yt and dt respectively 
represent the direction (bearing) and the distance between the caribou location 
at time step t and time step t + 1 and Xt are the values of hve exploratory angle 
variables. Besides directional persistence, yt-i, several explanatory angles were 
considered in the analysis. The ones kept for the hnal model are Xcut, the direction 
to the closest regenerating wood cut (that is an area where wood had been cut 
between 5 and 20 years ago), and the direction to the centroid of the closest patch 
Xcentre- A patch is an aggregation of locations recently visited by the animal; at a 
given step, the center of gravity of the patch’s locations is recalculated and a;centre 
is the angle of the segment joining the animal’s position and the updated center 
of gravity; see the Appendix of j2] for more details. In ecology, such a variable is 
called spatial memory effect, because animals tend to come back to familiar areas 
for forage and safety. 

D.l Observed Trajectory 



Figure 9: Caribou trajectory from December 28 2005 to April 15 2006 . 
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Figure shows the trajectory of the caribou. It strongly suggests two different 
movement behaviours: an “encamped” state characterized by short traveled dis¬ 
tances between two steps and an “exploratory” state with longer traveled distances. 

D.2 Partitioning the Data According to the Distance Trav¬ 
eled 

To assess whether the directionality of the steps is different between the “en¬ 
camped” and the “exploratory” states, we want to partition the dataset into two 
subsets: one with “ short” traveled distances and one with “long” traveled dis¬ 
tances. The cut point between “short” and “long” distances has to be precisely 
defined. 


C4 travelled distance 



Distance (km) 


Figure 10: Histogram of travelled distances of the caribou from December 28 
2005 to April 15 2006. 


Table 8: Summary statistics of the travelled distance (km). 


n Min 

Qi 

Median 

Mean 

Qs 

Max 

617 0.0015 

0.0642 

0.1259 

0.312 

0.249 

10.04 


The histogram shown in Figure IT and the summary statistics from Table 
indicate that most of the traveled distances are “short”, but with a non-negligible 
proportion of much longer distances. We therefore £t a mixture of exponential 
distributions to the sample of traveled distances and obtain the results given in 
Table [H 
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Table 9: Fitted mixture of exponential distribution on the distance. For i = 
1, 2, vTj and A* respectively represent the weight and the mean of the exponential 
distribution i. _ 


TTl 

Ai 

T^2 

A 2 

0.09 

0.4602 

0.91 

7.3119 


Finally, we can dehne a critical distance (the cutoff point between the “en¬ 
camped” and the “exploratory” states), ^critical, the weighted mean of the two 
exponential distributions in the mixture model; 

^critical = ^ + V ~ 0.3120 km. 

Ai A2 

This partitions the dataset into two subsets: one subset with 500 observations 
with distances traveled less than dcriticai and one subset with 117 observations 
with traveled distances greater than dcnticai- 

D.3 Directionality of Movement Conditional on Distance 
Traveled 

We can now analyze the distribution of the turning angles {yt — yt-i, t = 1,... ,T) 
in the two subsets of the dataset dehned by dcriticai- 


Table 10: Exploratory analysis of the distribution of the turning angles in each 
subset of the data dehned by dcnticai- 



Distance traveled < ^critical 

Distance traveled > ^critical 


t 1 


Circular histogram 


vx 

Subset size 

500 

117 

Mean direction (Rad) 

1.933 

0.40 

Resultant length 

0.0253 

0.3199 

Pvaiue, Kuiper’s test of uniformity 

> 0.15 

< 0.01 


Table IT shows that when the distance traveled exceeds dcriticai, the animal 
path tends to be a concatenation of straight lines, as the turning angles are not 
uniformly distributed over the circle, but rather concentrated around direction 0. 
But when the distance traveled is less than dcriticai, uniformity of the distribution 
of the turning angles is not ruled out, implying that the animal tends to turn 
around and does not have a preferred direction. 
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The final step of our exploratory analysis is to see if the environmental tar¬ 
gets have the same influence on the animal’s movement in the “encamped” and 
“exploratory” states. To do so, we have fitted to the entire dataset our proposed 
model with K = 1 state and the vector in the directional model (see (7) in the 
main paper) as 


Vt = 


J sm(a;cut) / ' 

^ cos(?/t_i) \ / cos(a;cut) 

sin(|/t_i) J ^ sin(a;cut) 


X Zt 


tiin^O/centery 

+ X Zt 


COs(Xcenter) 

sin(Xcenter) 


(18) 


with Zt take on value 1 if d* > dcriticai and value 0 otherwise, for f = 0,... ,T. 
After a backward selection of the variables based on Wald tests with significance 


levels of 0.05, we have obtained the model summarized in Table 11 These results 


agree with the observations from Table 10 and suggest that the directionality of 
the movement of the animal is different in the two subsets of the data defined by 

dcritical- 


Table 11: Parameter estimates of the final model for the directional model with 
directional mean vector given by (18) fitted to the entire dataset. 



Ko 

Ki 

^2 

«3 

Estimate 

0.0300 

0.1457 

0.2687 

0.4176 

S.e. 

0.0635 

0.0642 

0.0635 

0.1515 
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